Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis
Abstract
Feature selection is an effective approach to reducing dimensionality by selecting relevant original features. In this work, we studied a novel problem of multi-source feature selection for unlabeled data: given multiple heterogeneous data sources (or data sets), select features from one source of interest by integrating information from various data sources. In essence, we investigated how the information contained in multiple data sources can be used to derive intrinsic relationships that help select more meaningful (or domain-relevant) features. We studied how to adjust the covariance matrix of a data set using the geometric structure obtained from multiple data sources, and how to select features of the target source using geometry-dependent covariance. We designed and conducted experiments to systematically compare the proposed approach with representative methods in our attempt to solve the novel problem of multi-source feature selection. The empirical study demonstrated the efficacy and potential of multi-source feature selection.
Similar resources
Comprehensive causal analysis of occupational accidents’ severity in the chemical industries; A field study based on feature selection and multiple linear regression techniques
Introduction: The causal analysis of occupational accidents’ severity in the chemical industries may improve safety design programs in these industries. This comprehensive study was implemented to analyze the factors affecting occupational accidents’ severity in the chemical industries. Methods and Materials: An analytical study was conducted in 22 chemical industries during 2016-2017. The stu...
Multi-task Feature Selection based Anomaly Detection
Network anomaly detection is still a vibrant research area. As the fast growth of network bandwidth and the tremendous traffic on the network, there arises an extremely challengeable question: How to efficiently and accurately detect the anomaly on multiple traffic? In multi-task learning, the traffic consisting of flows at different time periods is considered as a task. Multiple tasks at diffe...
Real-time Pedestrian Detection Using a Boosted Multi-layer Classifier
Techniques for detecting pedestrian in still images have attracted considerable research interests due to its wide applications such as video surveillance and intelligent transportation systems. In this paper, we propose a novel simpler pedestrian detector using state-of-the-art locally extracted features, namely, covariance features. Covariance features were originally proposed in [1,2]. Unlik...
Diagonalization of time-delayed covariance matrices does not guarantee statistical independence in high-dimensional feature space
Independent Slow Feature Analysis (ISFA) is an algorithm for performing nonlinear blind source separation, which combines linear ICA with Slow Feature Analysis (SFA). In its current form the objective function is based on time-delayed covariance matrices. While the algorithm performs well in general, we occasionally encountered cases in which the estimated sources are highly statistically depen...
Multi-Label Classification Using Dependent and Independent Dual Space Reduction
While multi-label classification can be widely applied for problems where multiple classes can be assigned to an object, its effectiveness may be sacrificed due to curse of dimensionality in the feature space and sparseness of dimensionality in the label space. As a solution, this paper presents two alternative methods, namely Dependent Dual Space Reduction and Independent Dual Space Reduction,...
Publication date: 2008